A  LINEAR  THEORY  FOR  NON-CAUSALITY
J.  F.  Florens  and  M.  Mouchart 

1.1.  Introduction. 

Following Granger's (1969) and Sims' (1972) papers, the non-causality
concept has taken on great importance in the econometrics literature. This
concept is essentially the same as the concept of transitivity introduced
into statistics by Bahadur (1954) and used in sequential analysis (see
e.g. Hall, Wijsman and Ghosh (1965)). Intuitively, transitivity can be
presented in the following way: a sub-process $(z_n)_n$ of a multivariate
stochastic process $(x_n)_n$ is transitive if the past and current values
of $z_n$ are sufficient to forecast $z_{n+1}$. Equivalently, if $x_n$ is
partitioned into $(z_n, y_n)$, we say that the process generating $y_n$ does
not cause the process generating $z_n$.

A precise statement of this intuitive definition can be made in
different ways. In some of our previous work, non-causality is couched
in terms of sequences of independence conditions between σ-fields (see
Florens, Mouchart, 1980a, b and Florens, Mouchart, Rolin, 1980). In this
paper we propose definitions in terms of sequences of orthogonality
conditions between linear subspaces of the Hilbert space of random variables.
This kind of presentation is implicit in most econometrics papers and was
explicitly used by Hosoya (1977). (In the economics or econometrics literature,
see also e.g. Gourieroux, Montfort (1980) or Futia (1981), in which the same
kind of mathematical tools are used. In the time series literature this kind
of presentation is very common; see e.g. Anderson (1971).)

The main purpose of this paper is to show the equivalence of several
definitions of non-causality (Granger (1969), Sims (1972, 1980), Haugh
and Pierce (1977)). These authors have often simultaneously given a
definition of and a test procedure for non-causality. We essentially
analyze here the relations between these definitions. Comparison of the
properties of test procedures is clearly another story. For example,
we shall never use stationarity assumptions in the definitions of
non-causality or in the proofs of their equivalence. However, stationarity
is a crucial assumption in test procedures.

An important point about non-causality is its relationship with the
exogeneity concept used in the econometrics literature or, more generally,
with the theory of sufficiency and ancillarity in sequential models.
This relationship was the main topic of our previous papers (Florens,
Mouchart 1980a, Florens, Mouchart, Rolin 1980), in any of which a
bibliography can be found. In particular, the relationship with exogeneity is
studied in Florens, Mouchart (1980c) and in a paper by Engle, Hendry,
Richard (1980). So this point will not be treated here.

This paper is organized in the following way. Notation is presented
in the second part of the introduction. Section 2 is devoted to the original
definition of Granger and to the main properties of this concept. Sims'
first definition, Haugh and Pierce's definition and their respective
equivalences to Granger's definition are given in Sections 3 and 4. In Section
5 links between non-causality and rational expectations are pointed out
and Sims' second definition is presented and shown to be equivalent to
Granger's. Definitions, notation and results on orthogonality in a Hilbert
space are recalled in the appendix.



1.2.  Notations 


Let $(\Omega, G, P)$ be a probability space and $L^2$ be the Hilbert space
of square integrable random variables (defined $P$-almost surely).
In this paper inner product, orthogonality, projection, completion...
are relative to the canonical structure of $L^2$ (see any book on probability
theory, e.g. Neveu (1964)). For simplicity we restrict our presentation
to a bivariate discrete stochastic process $(x_n)_n = (y_n, z_n)_n$,
$n = 0, 1, \ldots$, i.e. to a double sequence of random variables $(y_n)_n$
and $(z_n)_n$. All random variables considered are assumed to be elements
of $L^2$. (Note that random variables are defined only almost surely, so we
in fact consider a class of stochastic processes such that each is a
modification of the others.)

It must be pointed out that the time index belongs to $\mathbb{N} = \{0, 1, \ldots\}$
and not to $\mathbb{Z} = \{\ldots, -1, 0, 1, \ldots\}$. (Continuous time is another story!)
This hypothesis does not limit our results and has the advantage of
making clear the role of initial conditions. In fact $\mathbb{N}$ must be
completed with a maximum element $\infty$. If $\mathbb{Z}$ is the time index, it must
be completed by a minimum element $-\infty$ and a maximum element $+\infty$. So
$-\infty$, if the time index is $\mathbb{Z}$, and $0$, if the time index is $\mathbb{N}$, play the same
role. (Details about this modification of the time index can be found in
Florens-Mouchart (1980b).)

Let $(y_n)_{n = 0, 1, \ldots}$ be a stochastic process. We denote by $y_n^m$ ($n \le m$)
the linear subspace of $L^2$ generated by $y_n, \ldots, y_m$. (For example, $y_n^n$
is the subspace generated by $y_n$.) If $m$ is finite, such finite-dimensional
subspaces are closed. $y_n^\infty$ denotes the closed subspace generated
by $y_n, y_{n+1}, \ldots$. $y_0^n$ represents the history (in the sense of all linear
functions of the past) of $y_{n+1}$. Similar notation is used for $(z_n)_n$.

2.  Granger’s  Non-Causality

For  expository  purposes,  we  first  recall  Granger's  (1969)  concept 
of  non-causality  along  with  some  of  its  main  properties. 

Definition 2.1.  y does not linearly cause z iff

(2.1)  $\forall n \ge 0$:  $z_{n+1} \perp y_0^n \mid z_0^n + u$  □

In this definition, u may represent initial conditions and any other
relevant information to be used by means of linear functions. Typically
u will include, at least, the constant functions, and also any
information available at the start of the process. Information that becomes
available later will be introduced in section 5. u is a closed linear
subspace of $L^2$.

For instance, when u contains the constant functions only, u
may be dropped in Definition 2.1 if either of the processes y or z has
zero mean. From the definition of conditional orthogonality (see Theorem
A.1 in the appendix), condition (2.1) allows several readings. The
projection of $z_{n+1}$ onto the linear space $y_0^n + z_0^n + u$ is contained in
the linear space $z_0^n + u$. Alternatively, the residual of the projection
of $z_{n+1}$ onto the linear space $z_0^n + u$ is orthogonal to the linear
space $y_0^n$.
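The two readings of (2.1) can be checked on a small example. The sketch below is only an illustration under stated assumptions: random variables are represented by their loadings on five orthonormal shocks (a0, a1, a2, b0, b1), so that the $L^2$ inner product reduces to a dot product; u is taken to be {0} (zero-mean variables); and the helper names (`project`, `residual`, `cond_orthogonal`) and the toy processes are hypothetical, not taken from the paper.

```python
# Toy finite-dimensional model (an assumption for illustration): each random
# variable is the vector of its loadings on orthonormal shocks
# (a0, a1, a2, b0, b1), so the inner product E(xy) is the plain dot product.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis of the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def project(x, vectors):
    """Orthogonal projection of x onto the span of `vectors`."""
    p = [0.0] * len(x)
    for b in orthonormal_basis(vectors):
        c = dot(x, b)
        p = [pi + c * bi for pi, bi in zip(p, b)]
    return p

def residual(x, vectors):
    """x minus its projection onto the span of `vectors`."""
    return [xi - pi for xi, pi in zip(x, project(x, vectors))]

def cond_orthogonal(E1, E2, E3, tol=1e-9):
    """E1 orthogonal to E2 given E3, checked on generators via residuals on E3."""
    return all(abs(dot(residual(a, E3), residual(b, E3))) < tol
               for a in E1 for b in E2)

z0 = [1.0, 0.0, 0.0, 0.0, 0.0]    # z_0 = a0
y0 = [0.0, 0.0, 0.0, 1.0, 0.0]    # y_0 = b0
z1 = [0.5, 1.0, 0.0, 0.0, 0.0]    # z_1 = 0.5 z_0 + a1
y1 = [1.0, 0.0, 0.0, 0.0, 1.0]    # y_1 = z_0 + b1   (z feeds into y)
z2 = [0.25, 0.5, 1.0, 0.0, 0.0]   # z_2 = 0.5 z_1 + a2
w1 = [0.5, 1.0, 0.0, 1.0, 0.0]    # w_1 = 0.5 z_0 + y_0 + a1   (y feeds into w)

# Condition (2.1) for n = 0 and n = 1, with u = {0}.
nc_n0 = cond_orthogonal([z1], [y0], [z0])
nc_n1 = cond_orthogonal([z2], [y0, y1], [z0, z1])

# First reading: projecting z_2 on y_0^1 + z_0^1 gives the same vector as
# projecting it on z_0^1 alone.
proj_full = project(z2, [y0, y1, z0, z1])
proj_own = project(z2, [z0, z1])
same_projection = all(abs(a - b) < 1e-9 for a, b in zip(proj_full, proj_own))

# Counter-example: for a process w with w_0 = z_0, y_0 helps to predict w_1.
causal_n0 = cond_orthogonal([w1], [y0], [z0])

print(nc_n0, nc_n1, same_projection, causal_n0)
```

In this toy model y does not linearly cause z even though z enters y, while the last check fails because y_0 enters w_1 directly.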

By theorems A.2 and A.8, (2.1) is equivalent to any one of the following
properties:

(2.2)  $\forall n \ge 0$:  $z_0^{n+1} \perp y_0^n \mid z_0^n + u$

(2.3)  $\forall (n,p)$, $0 \le p \le n$:  $z_{n+1} \perp y_p \mid z_0^n + u$

(2.4)  $\forall (n,p)$, $0 \le p \le n$:  $z_0^{n+1} \perp y_p \mid z_0^n + u$

If (2.1), or (2.2) or (2.3) or (2.4), is not satisfied, we shall
say that "y linearly causes z". Linear causality and non-causality enjoy
several interesting properties.

Property 2.2.  For any process z, z does not linearly cause z.
Indeed, by theorem A.1,

(2.5)  $\forall u$, $\forall n \ge 0$:  $z_{n+1} \perp z_0^n \mid z_0^n + u$.

Property  2.3.  Linear  causality  is  not  transitive. 

In other words, "y linearly causes z" and "z linearly causes w"
do not together imply "y linearly causes w". (Note that "transitivity"
is used here in the usual algebraic sense.)

We shall call "y and z" the set of processes $(\alpha y_n + \beta z_n)_n$ for
any $\alpha$ and $\beta$ in $\mathbb{R}$. This set may also be viewed as the linear space
generated by $y_n$ and $z_n$. The history of "y and z" up to the instant
n is defined as $y_0^n + z_0^n$. In other words, "y and z" represents the
set of information obtainable linearly from the observations $y_n$ and $z_n$.

Property 2.4.  "y and z" does not linearly cause w if and only
if neither y nor z linearly causes w.

In other words, and this is a direct implication of theorem A.3,
one has

(2.6)  $\forall n \ge 0$:  $w_{n+1} \perp (y_0^n + z_0^n) \mid w_0^n + u$

if and only if

       $\forall n \ge 0$:  $w_{n+1} \perp y_0^n \mid w_0^n + u$

and

       $\forall n \ge 0$:  $w_{n+1} \perp z_0^n \mid w_0^n + u$.

Property 2.5.  "y does not linearly cause z and w" does not
imply that y does not linearly cause z (or that y does not linearly
cause w).

It is therefore possible that y linearly causes z, y linearly
causes w but that y does not linearly cause "z and w". In other
words:

(2.7)  $\forall n \ge 0$, $\forall (\alpha, \beta) \in \mathbb{R}^2$:
       $(\alpha z_{n+1} + \beta w_{n+1}) \perp y_0^n \mid z_0^n + w_0^n + u$

does not imply:

       $z_{n+1} \perp y_0^n \mid z_0^n + u$.

Property 2.6.  "y linearly causes z and w" does not imply that
either y linearly causes z or that y linearly causes w.

It is therefore possible that y linearly causes "z and w" and
that neither y linearly causes z nor y linearly causes w.

Definition 2.1 has suggested testing for non-causality by testing
the following property:

(2.8)  $z_{n+1} \perp y_{n-p}^n \mid z_{n-q}^n + u$,  $n \ge \max(p,q)$

for some fixed values of p and q. In general (2.1) does not imply and
is not implied by (2.8). Therefore such a test may be justified only by
maintaining some supplementary hypotheses. These may be obtained by
means of the following theorem.

Theorem 2.7.  Property (2.8) is true for any p under the following
conditions:

(2.1)   $z_{n+1} \perp y_0^n \mid z_0^n + u$

(2.9)   $z_{n+1} \perp z_0^{n-q+1} \mid y_0^n + z_{n-q}^n + u$

(2.10)  $(y_0^n + z_{n-q}^n + u) \cap (z_0^n + u) = z_{n-q}^n + u$

Proof.  (2.1) and (2.9) imply that:

(2.11)  $(y_0^n + z_0^n + u)\, z_{n+1} \in (z_0^n + u) \cap (y_0^n + z_{n-q}^n + u)$

and, by (2.10), the l.h.s. of (2.11) belongs to $(z_{n-q}^n + u)$, i.e.:

(2.12)  $(y_0^n + z_0^n + u)\, z_{n+1} \in (z_{n-q}^n + u)$

i.e.

(2.13)  $z_{n+1} \perp (y_0^n + z_0^n) \mid z_{n-q}^n + u$.

Clearly (2.13) implies (2.8) for any p.  □

The role of theorem 2.7 may be viewed as follows. Condition (2.10)
means that any linear function of $(y_i, z_j, u : 0 \le i \le n,\ n-q \le j \le n)$
that is a.s. equal to a linear function of $(z_i, u : 0 \le i \le n)$ is a.s. a
function of $(z_i, u : n-q \le i \le n)$ only. This condition implies, but is
not equivalent to, the following property: the only linear functions on
$y_0^n$ a.s. equal to linear functions on $z_0^n$ are a.s. equal to linear functions
on $(z_{n-q}^n + u)$. Condition (2.10) may be viewed as a linear form of "measurable
separability" as defined in Mouchart and Rolin (1979) and may be termed
"linear separability". The purpose of this condition is to avoid
pathologies that could link the y-process and the z-process.

For Gaussian processes, condition (2.9) may be viewed as a Markovian
condition of order q for the conditional process generating $(z_{n+1} \mid y_0^n + u)$.
For general processes, condition (2.9) may be viewed as a "linear" Markovian
condition on the residual of the projection of $z_{n+1}$ on $(y_0^n + u)$.


Theorem 2.7 suggests testing for non-causality under the maintained
hypotheses (2.9) and (2.10) by testing (2.8) for some fixed (p,q).
This test is generally performed by testing the significance of the
coefficients of $y_n, \ldots, y_{n-p}$ in the regression of $z_{n+1}$ on
$y_n, \ldots, y_{n-p}, z_n, \ldots, z_{n-q}, u$ (u is in this case the constant
term). Note that neither (2.9) nor (2.10) involves an assumption of
stationarity, but stationarity implies that (2.9) may be approximately
satisfied for large values of q (see, e.g., Rozanov (1967)). For
autoregressive processes (of order smaller than or equal to q) condition
(2.9) will be exactly satisfied, whether the process is stationary or not.
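As an illustration of the regression reading of (2.8), the sketch below computes exact (population) projection coefficients rather than sample estimates. It makes several assumptions that are not in the paper: a toy model in which each variable is a vector of loadings on orthonormal shocks (a0, a1, a2, b0, b1), zero means (so no constant term), p = q = 0, and hypothetical helper names. An actual test would replace the inner products by sample moments and assess the significance of the estimated coefficients.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def solve(G, b):
    """Solve G x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(G, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def regression_coefficients(target, regressors):
    """Coefficients of the projection of `target` on the span of `regressors`
    (normal equations; the regressors are assumed linearly independent)."""
    G = [[dot(r, s) for s in regressors] for r in regressors]
    b = [dot(r, target) for r in regressors]
    return solve(G, b)

a2 = [0.0, 0.0, 1.0, 0.0, 0.0]    # shock a2
z1 = [0.5, 1.0, 0.0, 0.0, 0.0]    # z_1 = 0.5 z_0 + a1, with z_0 = a0
y1 = [1.0, 0.0, 0.0, 0.0, 1.0]    # y_1 = z_0 + b1
w1 = [0.5, 1.0, 0.0, 1.0, 0.0]    # w_1 = 0.5 z_0 + y_0 + a1, with y_0 = b0

# z_2 = 0.5 z_1 + a2: y does not enter the z-process.
z2 = [0.5 * zi + ai for zi, ai in zip(z1, a2)]
# w_2 = 0.5 w_1 + y_1 + a2: y enters the w-process.
w2 = [0.5 * wi + yi + ai for wi, yi, ai in zip(w1, y1, a2)]

coef_z = regression_coefficients(z2, [y1, z1])   # [coef of y_1, coef of z_1]
coef_w = regression_coefficients(w2, [y1, w1])   # [coef of y_1, coef of w_1]
print(coef_z, coef_w)
```

The coefficient of y_1 vanishes in the first regression and equals one in the second, which is the population counterpart of what the significance test looks for.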
Sims (1972) obtained the following result.

Theorem 3.1.  y does not linearly cause z if and only if

(3.1)  $\forall n \ge 0$:  $z_0^\infty \perp y_n \mid z_0^n + u$.

Proof.  A general proof of this result was given by Hosoya (1977)
but we have found the following proof both simple and insightful.

From (2.2), (2.1) is equivalent to:

(3.2)  $\forall n \ge 0$:  $z_0^{n+1} \perp y_0^n \mid z_0^n + u$.

This implies (3.1) because, by Theorem A.10, (3.2) is equivalent to:

(3.3)  $\forall n \ge 0$:  $z_0^\infty \perp y_0^n \mid z_0^n + u$.

Reciprocally, by Theorem A.3, (3.1) implies:

(3.4)  $\forall n \ge 0$, $\forall p \ge 0$:  $z_0^\infty \perp y_p \mid z_0^{n+p} + u$

and this implies (3.2) by theorem A.8 and because $y_0^n$ is generated by
$(y_p : 0 \le p \le n)$.  □

The easiest interpretation of (3.1) may be the following: the projection
of $y_n$ onto $z_0^\infty + u$ (i.e. the best linear approximation of $y_n$ by an
element of $z_0^\infty + u$) belongs to $z_0^n + u$. In other words, only the past and
the current values of z are relevant to explain $y_n$.

Theorem 3.1 depends crucially on linearity: in terms of independence
in probability, this result would be false (see e.g. Florens and Mouchart,
1980b).

An immediate implication of theorem A.11 is the following result.

Theorem 3.2.  The following properties are equivalent:

(3.5)  $\forall n \ge 0$:  $z_0^\infty \perp y_n \mid z_0^n + y_0^{n-1} + u$

(3.6)  $\forall n \ge 0$:  $z_0^\infty \perp y_n \mid z_0^n + y_0 + u$.

Therefore, (3.5) is an alternative form of linear non-causality if the
initial condition $y_0$ is a linear function of u. In the non-linear
theory, the condition analogous to (3.5) has been introduced as a
modified Sims' condition so as to obtain a condition equivalent to
Granger's (see e.g. Chamberlain (1980) and Florens and Mouchart (1980
a, b and c)).

Note that the properties (3.1), (3.5) and (3.6) are not modified if
$z_0^\infty$ is replaced by $z_{n+1}^\infty$ (theorem A.2).

For practical applications or for hypothesis testing Sims' definition
(3.1) can be replaced by:

(3.7)  $\forall n \ge \max(p,q)$:  $z_{n+1}^{n+p+1} \perp y_n \mid z_{n-q}^n + u$

for fixed values of p and q. In general (3.1) and (3.7) are not
equivalent and further hypotheses are required to guarantee an implication
between these two definitions. A theorem analogous to theorem 2.7 could be
given without difficulty.

An interesting problem is the relationship between Granger's and
Sims' definitions when the dimensions of the future and past of the
processes ((2.8) and (3.7)) are fixed. It should be noted that, in general,
there is no relationship between these definitions.

The following theorem shows the equivalence between (2.8) and (3.7)
under a condition on the marginal process generating $(z_n)_n$.

Theorem 3.3.  Under the following hypothesis:

(3.8)  $\forall n \ge p+q$:  $z_{n+1} \perp z_{n-(p+q)}^{n-q-1} \mid z_{n-q}^n + u$,

the following two conditions are equivalent:

(3.9)   $\forall n \ge \max(p,q)$:  $z_{n+1} \perp y_{n-p}^n \mid z_{n-q}^n + u$

(3.10)  $\forall n \ge \max(p,q)$:  $z_{n+1}^{n+p+1} \perp y_n \mid z_{n-q}^n + u$.

Proof.  a) (3.9) ⇒ (3.10) follows from the following property,
which will be proved by induction:

(3.11)  $\forall n \ge \max(p,q)$, $\forall j = 1, \ldots, p+1$:  $z_{n+1}^{n+j} \perp y_n \mid z_{n-q}^n + u$.

(3.9) implies (3.11) with j = 1. Let us assume (3.11) is true for
some $j \le p$. (3.9) gives us:

(3.12)  $z_{n+j+1} \perp y_{n+j-p}^{n+j} \mid z_{n+j-q}^{n+j} + u$.

As $y_n \in y_{n+j-p}^{n+j}$, (3.12) implies:

(3.13)  $z_{n+j+1} \perp y_n \mid z_{n+j-q}^{n+j} + u$.

(3.8) implies

(3.14)  $z_{n+j+1} \perp z_{n-q}^{n+j-q-1} \mid z_{n+j-q}^{n+j} + u$

and by theorem A.3, (3.13) and (3.14) imply

(3.15)  $z_{n+j+1} \perp y_n \mid z_{n-q}^{n+j} + u$

or equivalently

(3.16)  $z_{n+j+1} \perp y_n \mid z_{n+1}^{n+j} + z_{n-q}^n + u$

and by theorem A.3, (3.11) and (3.16) give:

        $z_{n+1}^{n+j+1} \perp y_n \mid z_{n-q}^n + u$,

which is (3.11) for j+1.

b) (3.10) ⇒ (3.9) follows from the following property, also
verified by induction:

(3.17)  $\forall n \ge \max(p,q)$, $\forall i = 0, \ldots, p$:  $z_{n+1} \perp y_{n-i}^n \mid z_{n-q}^n + u$.

(3.17) is true for i = 0 by (3.10). Let us assume (3.17) is satisfied
for some $i \le p-1$. From (3.10) we get:

(3.18)  $z_{n-i}^{n-i+p} \perp y_{n-i-1} \mid z_{n-i-1-q}^{n-i-1} + u$.

As $z_{n-i}^n \subset z_{n-i}^{n-i+p}$ we get from (3.18) (by corollary A.4):

(3.19)  $z_{n-i}^{n-i+p} \perp y_{n-i-1} \mid z_{n-i-1-q}^n + u$

and, since $z_{n+1}$ belongs to $z_{n-i}^{n-i+p}$,

(3.20)  $z_{n+1} \perp y_{n-i-1} \mid z_{n-i-1-q}^n + u$.

By theorem A.3, (3.8) and (3.20) imply:

(3.21)  $z_{n+1} \perp y_{n-i-1} \mid z_{n-q}^n + u$.

And also by theorem A.3, (3.17) and (3.21) imply:

(3.22)  $z_{n+1} \perp y_{n-i-1}^n \mid z_{n-q}^n + u$.

(3.17) is then verified and the proof is completed.  □

The marginal process generating $(z_n)_{n \ge 0}$ is (linearly) autoregressive
if

(3.23)  $\forall n \ge q$:  $z_{n+1} \perp z_0^{n-q-1} \mid z_{n-q}^n + u$.

(3.8) is implied by (3.23) but (3.8) is weaker than (3.23). Note that
theorem 3.3 cannot be stated in terms of conditional independence instead
of conditional orthogonality because the proof depends crucially on (see
parts (i)(b) and (ii)(b) of theorem A.3) the fact that, with the same
notation as in the appendix, $E_1 \perp E_2 \mid E_3$ and $E_1 \perp E_4 \mid E_3$ imply
$E_1 \perp (E_2 + E_4) \mid E_3$. This property has no equivalent in terms of
conditional independence (see Florens and Mouchart (1980b)).


4.  Haugh  and  Pierce’s  Non-Causality


Haugh and Pierce (1977) have suggested analysing the cross-correlations
between the innovations of the z-process and the innovations of the
y-process. In this section we compare the approach of Haugh and Pierce
and linear non-causality.

In our notation, the innovations of a process $(y_n)_{n \ge 0}$ form the
process denoted $y_n - y_0^{n-1} y_n = (y_0^{n-1})^\perp y_n$, i.e. the difference between $y_n$
and its projection on the linear space of all the linear combinations of
$y_0, \ldots, y_{n-1}$, or, alternatively, the projection of $y_n$ on the orthogonal
complement of this space of linear combinations. So the property stated
in the following theorem can be viewed as the Haugh and Pierce definition
of non-causality, rewritten in our notation.


Proof.  We first rewrite (4.1) as follows:

(4.3)  $\forall n \ge 0$, $\forall p < n$:  $(z_0^n + u)^\perp z_{n+1} \perp y_{p+1} \mid y_0^p$.

By theorem A.9, (4.3) is equivalent to:

(4.4)  $\forall n \ge 0$, $\forall p < n$:  $(z_0^n + u)^\perp z_{n+1} \perp y_{p+1} \mid y_0$

i.e.

(4.5)  $\forall n \ge 0$, $\forall p < n$:  $(z_{n+1} \mid z_0^n + u) \perp (y_{p+1} \mid y_0)$

and, by theorem A.8, (4.5) is equivalent to

(4.6)  $\forall n \ge 0$:  $(z_{n+1} \mid z_0^n + u) \perp (y_0^n \mid y_0)$

and (4.6) is equivalent to (2.1), by theorem A.6, if $y_0 \in z_0 + u$.  □
These theorems show that the equivalence between Haugh and Pierce's
condition (4.1) and linear non-causality basically depends on the
specification of the initial condition $y_0$. If u has the form
$u = y_0 + v$, then the two approaches are equivalent; otherwise linear
non-causality implies, but is not implied by, condition (4.1).

5.  Rational  Expectations  and  Non-Causality.

Non-causality may be viewed as the condition that the prediction of
$z_{n+1}$ based only on its own history $z_0^n$ will not improve if it is also
based on the past history of y. This suggests that "y does not cause
z" may be rephrased as "z is self-predictive w.r.t. y" (for more
justification see e.g. Florens and Mouchart (1980c)).

Consider now a sequence of messages and, associated with it, a sequence
of "information sets" $I_n$ representing the information contained in all
the messages up to instant n. Then one may decompose $z_n$ (or $y_n$) into
an "expected" component given $I_n$ and an "unexpected" one. An interesting
question is to analyze non-causality in terms of such a decomposition for
both y and z. This was the object of a recent paper by Sims (1980).

In a linear context, the sequence $I_n$ will be an increasing sequence
of closed subspaces of $L^2$; often $I_n$ will have the form $w_0^n$ where $\{w_n\}$
is a sequence of "observations". The "expected" component $\hat{z}_n$ of $z_n$
becomes the projection of $z_n$ on $I_n$, and its "unexpected" component $\varepsilon_n$
becomes the projection of $z_n$ on the orthogonal complement of $I_n$. In
other words, we have the following decomposition:

(5.1)  $z_n = \hat{z}_n + \varepsilon_n$,  $\hat{z}_n = I_n z_n$,  $\varepsilon_n = I_n^\perp z_n$.

Similarly for $y_n$:

(5.2)  $y_n = \hat{y}_n + \eta_n$,  $\hat{y}_n = I_n y_n$,  $\eta_n = I_n^\perp y_n$.

Remark.  It should be pointed out that a closed subspace is a poor
mathematical translation of the intuitive concept of information. Indeed,
"knowing" $w_0^n$ should involve knowing all (measurable) transformations of
$w_0^n$, and not only the linear ones. In other words, the σ-field is the
natural translation for the concept of information. In the present
context, representing expectations by projection on a space of linear
functions (defined on $w_0^n$) only may be justified on the grounds of
computational simplicity or by a Gaussian assumption. Finally, the
natural way to deal with an increasing sequence of σ-fields is to
introduce a filtration and to consider as the rational expectation of a given
process the nearest process adapted to that filtration (see, e.g.,
Dellacherie-Meyer (1976); see also Futia (1981)).

In Sims (1980), the information sets $I_n$ are taken to be:

(5.3)  $I_n = z_0^{n-1} + y_0^{n-1} + u$,  $n \ge 1$.

Therefore:

(5.4)  $\varepsilon_n = (z_0^{n-1} + y_0^{n-1} + u)^\perp z_n$,  $n \ge 1$

(5.5)  $\eta_n = (z_0^{n-1} + y_0^{n-1} + u)^\perp y_n$,  $n \ge 1$.

It is also assumed that the initial conditions $z_0$ and $y_0$ are totally
unexpected, i.e.

(5.6)  $\varepsilon_0 = z_0$ and $\eta_0 = y_0$.


With the above notation, we have the following results.

Theorem 5.1.  If y does not linearly cause z, then:

(5.7)  $\forall n \ge 0$:  $z_{n+1} \perp \eta_0^n \mid \varepsilon_0^n + u$.

Proof.  Step 1. We first prove:

(5.8)  $\forall n \ge 1$:  $\varepsilon_n = (z_0^{n-1} + u)^\perp z_n$.

We note that condition (2.1) may be written as:

(5.9)  $\forall n \ge 1$:  $(y_0^{n-1} + z_0^{n-1} + u)\, z_n = (z_0^{n-1} + u)\, z_n$.

This is equivalent to (5.8).

Step 2. We now prove:

(5.10)  $\varepsilon_0^n + u = z_0^n + u$,  $n \ge 0$.

Given (5.6), this is trivial for n = 0. Suppose (5.10) is true for
some n. Then from (5.8), we have

        $\varepsilon_{n+1} = z_{n+1} - (z_0^n + u)\, z_{n+1}$

which clearly belongs to $z_0^{n+1} + u$.
Therefore, under (5.10) for some n, $\varepsilon_0^{n+1} + u \subset z_0^{n+1} + u$. Conversely,
from (5.9) again, $z_{n+1} = \varepsilon_{n+1} + (z_0^n + u)\, z_{n+1}$ which, under (5.10) for some
n, belongs to $\varepsilon_0^{n+1} + u$; therefore under (5.10) for some n,
$z_0^{n+1} + u \subset \varepsilon_0^{n+1} + u$.

Step 3. We now prove:

(5.11)  $\forall n \ge 0$:  $\eta_0^n \subset y_0^n + z_0^n + u$.

Indeed,

        $\eta_p = y_p - (y_0^{p-1} + z_0^{p-1} + u)\, y_p \in y_0^n + z_0^n + u$  $\forall p \le n$.

The proof is concluded by noticing that (2.1) may be written as:

(5.12)  $z_{n+1} \perp (y_0^n + z_0^n + u) \mid z_0^n + u$.  □
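The construction (5.3) to (5.7) can be traced on a small example. The sketch below is an illustration only, under assumptions that are not in the paper: variables are vectors of loadings on orthonormal shocks (a0, a1, a2, b0, b1), u = {0}, the toy processes are chosen so that y does not linearly cause z, and all helper names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis of the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def residual(x, vectors):
    """x minus its orthogonal projection onto the span of `vectors`."""
    r = list(x)
    for b in orthonormal_basis(vectors):
        c = dot(x, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r

def cond_orthogonal(E1, E2, E3, tol=1e-9):
    """E1 orthogonal to E2 given E3, checked on generators via residuals on E3."""
    return all(abs(dot(residual(a, E3), residual(b, E3))) < tol
               for a in E1 for b in E2)

def in_span(x, vectors, tol=1e-9):
    return all(abs(c) < tol for c in residual(x, vectors))

zs = [[1.0, 0.0, 0.0, 0.0, 0.0],    # z_0 = a0
      [0.5, 1.0, 0.0, 0.0, 0.0],    # z_1 = 0.5 z_0 + a1
      [0.25, 0.5, 1.0, 0.0, 0.0]]   # z_2 = 0.5 z_1 + a2
ys = [[0.0, 0.0, 0.0, 1.0, 0.0],    # y_0 = b0
      [1.0, 0.0, 0.0, 0.0, 1.0]]    # y_1 = z_0 + b1

def unexpected(series):
    """(5.4)-(5.6): residual of x_n on I_n = z_0^{n-1} + y_0^{n-1};
    for n = 0 the information set is empty, so the residual is x_0 itself."""
    return [residual(x, zs[:n] + ys[:n]) for n, x in enumerate(series)]

eps = unexpected(zs)   # eps_0, eps_1, eps_2
eta = unexpected(ys)   # eta_0, eta_1

# (5.10) for n = 1: eps_0^1 and z_0^1 generate the same subspace.
same_span = (all(in_span(z, eps[:2]) for z in zs[:2])
             and all(in_span(e, zs[:2]) for e in eps[:2]))

# (5.7) for n = 0 and n = 1.
cond_57 = [cond_orthogonal([zs[n + 1]], eta[:n + 1], eps[:n + 1]) for n in (0, 1)]
print(same_span, cond_57)
```

Here the unexpected components reduce to the underlying shocks (eps_n is a_n and eta_n is b_n), which is why both checks succeed in this toy model.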

Theorem 5.2.  If $y_0 \in u$, then condition (5.7) implies that y
does not linearly cause z.

Proof.  Step 1. We first prove that the assumptions imply:

(5.13)  $\forall n \ge 0$:  $z_{n+1} \perp y_0^n \mid \varepsilon_0^n + u$.

Indeed, (5.7) is equivalent to:

$\forall n \ge 0$:  $(\varepsilon_0^n + u)^\perp z_{n+1} \perp \eta_0^n$

⇔  $(\varepsilon_0^n + u)^\perp z_{n+1} \perp \eta_p$   $\forall (p,n)$, $0 \le p \le n$

⇔  $(\varepsilon_0^n + u)^\perp z_{n+1} \perp (y_0^{p-1} + z_0^{p-1} + u)^\perp y_p$   $\forall (p,n)$, $0 \le p \le n$

⇔  $(\varepsilon_0^n + u)^\perp z_{n+1} \perp y_p \mid y_0^{p-1} + z_0^{p-1} + u$   $\forall (p,n)$, $0 \le p \le n$

⇒  $(\varepsilon_0^n + u)^\perp z_{n+1} \perp y_p \mid y_0 + z_0 + u$   (theorem A.9)   $\forall (p,n)$, $0 \le p \le n$

⇒  $(\varepsilon_0^n + u)^\perp z_{n+1} \perp y_p \mid z_0 + u$   (because $y_0 \in u$)   $\forall (p,n)$, $0 \le p \le n$

⇒  $z_{n+1} \perp y_p \mid \varepsilon_0^n + u$   $\forall (p,n)$, $0 \le p \le n$.

The last step is made by using the fact that $z_0 + u \subset \varepsilon_0^n + u$
and by applying theorem A.6.

Step 2. We now prove (2.1) by induction. It is clearly true for
n = 0 under the hypothesis $y_0 \in u$. Now suppose (2.1) is true for
$n \le p$; we prove that (2.1) is also true for n = p+1. From step 1 in
theorem 5.1, (5.8) is true for $n \le p$ and, from step 2, (5.10) is also
true for $n \le p$; in particular $\varepsilon_0^p + u = z_0^p + u$; therefore, by (5.13),
(2.1) is true for n = p+1.  □
Theorem 5.1 shows that condition (5.7) is, in general, weaker than
non-causality (it is implied by it), but theorem 5.2 shows that condition
(5.7) is actually equivalent to non-causality if the initial condition on y
is included in u (which implies that both expectations $\hat{z}_n$ and $\hat{y}_n$
may involve $y_0$).

Extensions of non-causality properties can easily be made in terms of
expected or unexpected components of the variables. As an example we give
the following version of theorem 3.1.

Theorem 5.3.  If $y_0 \in u$, y does not linearly cause z if and
only if

(5.14)  $\forall n \ge 0$:  $z_{n+1}^\infty \perp \eta_0^n \mid \varepsilon_0^n + u$.

Proof.  Using theorems 5.1 and 5.2 we just have to prove the
equivalence of (5.14) and (5.7).

(5.14) implies (5.7) by theorem A.2 (since $z_{n+1} \in z_{n+1}^\infty$). (5.7)
implies (5.14) by using theorem A.10 (note that if y does not linearly
cause z, we have $\forall n$: $z_0^n + u = \varepsilon_0^n + u$; see step 2 of the proof of
theorem 5.1).  □

Finally let us note that the property "y does not cause z" implies
the following conditional orthogonality:

(5.15)  $\forall n \ge 0$, $\forall k > 0$:  $\varepsilon_{n+k+1} \perp \eta_{n+1} \mid \varepsilon_0^{n+k} + \eta_0^n + u$

(by theorem 5.1 and corollary A.4). This property is actually the
property tested by Sims in his 1980 paper.

Appendix.  Orthogonality.

Let $\mathcal{H} = (L, \langle \cdot, \cdot \rangle)$ be a Hilbert space on $\mathbb{R}$, i.e. L is a linear
space of vectors $x_i$ and $\langle \cdot, \cdot \rangle$ is an inner product (bilinear, symmetric
and positive definite) which makes L complete. Details on this structure
can be found in e.g. Halmos (1957) or Greub (1975). The Hilbert space we
use in the main body of this paper is the set of (classes of) random
variables y defined (up to an almost sure equality) on a probability
space $(\Omega, G, P)$ and such that $E(y^2)$ is finite. The inner product between
y and z is then defined by E(yz). However, definitions and results
given in this appendix are stated in terms of a general Hilbert space.

Let the $E_i$ be complete (or, equivalently, closed) linear subspaces
of L, where in particular $E_0$ is the subspace containing the null vector
only. $E_i + E_j$ denotes the usual sum of subspaces. $x_1 \perp x_2$ denotes
the usual orthogonality w.r.t. $\langle \cdot, \cdot \rangle$ (i.e. $\langle x_1, x_2 \rangle = 0$). Likewise
$x \perp E$ means $x \perp e$ $\forall e \in E$, and $E_1 \perp E_2$ means $e_1 \perp e_2$ $\forall e_1 \in E_1$,
$e_2 \in E_2$. $E^\perp$ denotes the orthogonal complement of E, i.e.
$E^\perp = \{x \in L \mid x \perp E\}$. Note that $E_0^\perp = L$. Finally, $Ex$ denotes the orthogonal
projection of the vector x on the subspace E, i.e. the unique vector
$e \in E$ such that $(x - e) \perp E$. Note that this is a linear idempotent
operation. We shall make use of the following property:

$\langle x_1, E x_2 \rangle = \langle E x_1, x_2 \rangle = \langle E x_1, E x_2 \rangle$  $\forall x_1, x_2 \in L$ and $\forall E$.

Note that $Lx = x$ and $E_0 x = 0$. This notation is extended as follows:
$E_2 E_1$ means the projection of $E_1$ on $E_2$. Note that $E_0 E_1 = E_0$ and
$L E_1 = E_1$. We shall also use:

$E_1 \subset E_2 \Rightarrow E_1 E_2 = E_2 E_1 = E_1$.

We shall introduce in this appendix the two concepts of conditional
orthogonality and biconditional orthogonality. They are not new concepts,
as they are merely particular cases of orthogonality, but they provide
convenient notation and results for our kind of problems.


A.1.  Theorem.

The following properties are equivalent and define "$E_1$ and $E_2$
are orthogonal conditionally on $E_3$", which is denoted as "$E_1 \perp E_2 \mid E_3$":

(i)  $(x_1, x_2) \in E_1 \times E_2 \Rightarrow (x_1 - E_3 x_1) \perp (x_2 - E_3 x_2)$

     (or $E_3^\perp E_1 \perp E_3^\perp E_2$ or $E_3^\perp E_1 \perp E_2$ or $E_1 \perp E_3^\perp E_2$)

(ii)  $x_1 \in E_1 \Rightarrow (E_2 + E_3)\, x_1 = E_3 x_1$

      (or $(E_2 + E_3) E_1 = E_3 E_1$)

(iii)  $x_2 \in E_2 \Rightarrow (E_1 + E_3)\, x_2 = E_3 x_2$

       (or $(E_1 + E_3) E_2 = E_3 E_2$).  □

From (i), conditional orthogonality may be interpreted as the usual
orthogonality between the projections of $E_1$ and $E_2$ on $E_3^\perp$. In a
statistical context, these projections will be recognized as the residuals
in regression analysis.
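The residual interpretation can be made concrete in a finite-dimensional Hilbert space. The sketch below is a minimal illustration, assuming $L = \mathbb{R}^4$ with the usual dot product; the helper names and the example vectors are hypothetical. It checks readings (i) and (ii) of theorem A.1 for two vectors that are not orthogonal but become so once their common component in $E_3$ is removed.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis of the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def project(x, vectors):
    """Orthogonal projection E x, where E is the span of `vectors`."""
    p = [0.0] * len(x)
    for b in orthonormal_basis(vectors):
        c = dot(x, b)
        p = [pi + c * bi for pi, bi in zip(p, b)]
    return p

def residual(x, vectors):
    """x - E x, i.e. the projection of x on the orthogonal complement of E."""
    return [xi - pi for xi, pi in zip(x, project(x, vectors))]

x1 = [1.0, 1.0, 0.0, 0.0]   # generator of E1
x2 = [1.0, 0.0, 1.0, 0.0]   # generator of E2
e3 = [1.0, 0.0, 0.0, 0.0]   # generator of E3

plain_inner = dot(x1, x2)                                   # not zero
resid_inner = dot(residual(x1, [e3]), residual(x2, [e3]))   # reading (i)

# Reading (ii): projecting x1 on E2 + E3 gives the same vector as on E3.
on_sum = project(x1, [x2, e3])
on_e3 = project(x1, [e3])
reading_ii = all(abs(a - b) < 1e-9 for a, b in zip(on_sum, on_e3))
print(plain_inner, resid_inner, reading_ii)
```

The two generators have inner product 1, yet their residuals on $E_3$ are orthogonal, so $E_1 \perp E_2 \mid E_3$ holds without $E_1 \perp E_2$.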
A.2.  Elementary properties.

(i)    $E_1 \perp E_2 \mid E_0 \Leftrightarrow E_1 \perp E_2$

(ii)   $E_4 \subset E_1$ and $E_1 \perp E_2 \mid E_3$ imply $E_4 \perp E_2 \mid E_3$

(iii)  $E_4 \subset E_3$ implies $E_1 \perp E_4 \mid E_3$ $\forall E_1$

(iv)   $E_4 \subset E_3$ and $E_1 \perp E_2 \mid E_3$ imply $(E_1 + E_4) \perp E_2 \mid E_3$

A.3.  Fundamental property of conditional orthogonality.

The following properties are equivalent:

(i)    (a) $E_1 \perp E_2 \mid E_3$  and  (b) $E_1 \perp E_4 \mid E_3$

(ii)   (a) $E_1 \perp E_2 \mid E_3$  and  (b) $E_1 \perp E_4 \mid E_2 + E_3$

(iii)  $E_1 \perp (E_2 + E_4) \mid E_3$.

Proof.  (iii) ⇒ (i) by A.2 (ii). To prove that (i) ⇒ (iii) take
$x \in E_2 + E_4$, i.e. $x = x_2 + x_4$. Then, by the linearity of projection,
(i) implies that $(E_1 + E_3)\, x = E_3 x$. Similarly, (iii) ⇒ (ii a) by A.2
(ii) and (iii) ⇒ (ii b) because, by (iii), $(E_2 + E_3 + E_4)\, x_1 = E_3 x_1$ $\forall x_1 \in E_1$,
which implies $(E_2 + E_3)\, x_1 = E_3 x_1$ as a property of projections. Finally
(ii) ⇒ (iii) because for any $x_1 \in E_1$, $(E_2 + E_3 + E_4)\, x_1 = (E_2 + E_3)\, x_1 = E_3 x_1$
by (ii b) and (ii a) successively.  □
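The equivalence of (i), (ii) and (iii) can be observed numerically. The sketch below is only an illustration under assumptions: $L = \mathbb{R}^4$ with the dot product, one generator per subspace, and hypothetical helper names. It evaluates the three statements once on a configuration where they all hold and once on a configuration where they all fail; it does not replace the proof.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def orthonormal_basis(vectors, tol=1e-12):
    """Gram-Schmidt: orthonormal basis of the span of `vectors`."""
    basis = []
    for v in vectors:
        w = list(v)
        for b in basis:
            c = dot(w, b)
            w = [wi - c * bi for wi, bi in zip(w, b)]
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis

def residual(x, vectors):
    """x minus its orthogonal projection onto the span of `vectors`."""
    r = list(x)
    for b in orthonormal_basis(vectors):
        c = dot(x, b)
        r = [ri - c * bi for ri, bi in zip(r, b)]
    return r

def cond_orthogonal(E1, E2, E3, tol=1e-9):
    """E1 orthogonal to E2 given E3; each argument is a list of generators."""
    return all(abs(dot(residual(a, E3), residual(b, E3))) < tol
               for a in E1 for b in E2)

def a3_statements(E1, E2, E3, E4):
    """Truth values of statements (i), (ii), (iii) of A.3."""
    i = cond_orthogonal(E1, E2, E3) and cond_orthogonal(E1, E4, E3)
    ii = cond_orthogonal(E1, E2, E3) and cond_orthogonal(E1, E4, E2 + E3)
    iii = cond_orthogonal(E1, E2 + E4, E3)   # list concatenation spans E2 + E4
    return (i, ii, iii)

E1 = [[1.0, 1.0, 0.0, 0.0]]
E2 = [[1.0, 0.0, 1.0, 0.0]]
E3 = [[1.0, 0.0, 0.0, 0.0]]
E4_good = [[0.0, 0.0, 1.0, 1.0]]   # residual on E3 is orthogonal to that of E1
E4_bad = [[0.0, 1.0, 0.0, 1.0]]    # shares the second coordinate with E1

holds = a3_statements(E1, E2, E3, E4_good)
fails = a3_statements(E1, E2, E3, E4_bad)
print(holds, fails)
```

In both configurations the three statements take the same truth value, as the theorem requires.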
A.4.  Corollary.

$E_4 \subset E_2 + E_3$ and $E_1 \perp E_2 \mid E_3$ imply $E_1 \perp E_2 \mid E_3 + E_4$.  □

A.5.  Theorem.

The following properties are equivalent and define "$E_1$ conditionally
on $E_2$ and $E_3$ conditionally on $E_4$ are orthogonal", which is denoted by
"$(E_1 \mid E_2) \perp (E_3 \mid E_4)$":

(i)    $\forall (x_1, x_3) \in E_1 \times E_3 \Rightarrow (x_1 - E_2 x_1) \perp (x_3 - E_4 x_3)$

(ii)   $E_2^\perp E_1 \perp E_4^\perp E_3$

(iii)  $E_1 \perp E_4^\perp E_3 \mid E_2$

(iv)   $E_2^\perp E_1 \perp E_3 \mid E_4$.

A.6.  Elementary properties.

(i)    $E_1 \perp E_2 \mid E_3 \Leftrightarrow (E_1 \mid E_3) \perp (E_2 \mid E_3)$

       in particular:  $E_1 \perp E_2 \Leftrightarrow (E_1 \mid E_0) \perp (E_2 \mid E_0)$

(ii)   $E_1 \perp E_2 \mid E_3 \Leftrightarrow (E_1 \mid E_0) \perp (E_2 \mid E_3) \Leftrightarrow E_1 \perp (E_2 \mid E_3)$

(iii)  $E_5 \subset E_1$ and $(E_1 \mid E_2) \perp (E_3 \mid E_4)$ imply $(E_5 \mid E_2) \perp (E_3 \mid E_4)$

(iv)   $E_4 \subset E_2$ and $(E_1 \mid E_2) \perp (E_3 \mid E_4) \Rightarrow E_1 \perp E_3 \mid E_2$.

A.7.  Fundamental property of biconditional orthogonality.

The following properties are equivalent:

(i)    (a) $(E_1 \mid E_2) \perp (E_3 \mid E_4)$  and  (b) $(E_1 \mid E_2) \perp (E_5 \mid E_4)$

(ii)   (a) $(E_1 \mid E_2) \perp (E_3 \mid E_4)$  and  (b) $(E_1 \mid E_2) \perp (E_5 \mid E_3 + E_4)$

(iii)  $(E_1 \mid E_2) \perp [(E_3 + E_5) \mid E_4]$.

Proof.  From theorem A.5 (iv) one may replace $(E_1 \mid E_2)$ by
$E_2^\perp E_1$ and then use A.3.  □


A.8.  Theorem.

Let A be a subset of L and E be the closed linear subspace
of L generated by A. Then for any closed linear subspaces $E_1, E_2, E_3$
the following two properties are equivalent:

(i)   $\forall x \in A$:  $(x \mid E_1) \perp (E_2 \mid E_3)$

(ii)  $(E \mid E_1) \perp (E_2 \mid E_3)$.

Sequences of conditional orthogonalities.

A.9.  Theorem.

Let $(F_n)_{n \ge 0}$ be an increasing sequence ($F_{n-1} \subset F_n$) of closed
linear subspaces of L. Then, for any E and G, the following
properties are equivalent:

(i)   $\forall n \ge 1$:  $E \perp F_n \mid F_{n-1} + G$

(ii)  $\forall n \ge 1$:  $E \perp F_n \mid F_0 + G$.

Proof.  (ii) implies (i) by A.4. The converse follows from the
property: $\forall n \ge 1$, $\forall q = 1, \ldots, n$: $E \perp F_n \mid F_{n-q} + G$, which is proved
by induction. It is clearly true for q = 1. Let us assume this property
for general q and note that (i) implies $E \perp F_{n-q} \mid F_{n-(q+1)} + G$. The
result follows from application of A.3 and A.4.  □


A.10.  Theorem.

Let $(E_n)_{n \ge 0}$ and $(F_n)_{n \ge 0}$ be increasing sequences of closed
linear subspaces of L, and let $E_\infty$ be the closed subspace generated
by $\bigcup_{n \ge 0} E_n$. Then the following properties are equivalent:

(i)   $\forall n \ge 0$:  $E_{n+1} \perp F_n \mid E_n + G$

(ii)  $\forall n \ge 0$:  $E_\infty \perp F_n \mid E_n + G$.

Proof.  (ii) implies (i) by A.2. The converse follows by using
A.6 and the property:

$\forall n \ge 0$, $\forall p \ge 1$:  $E_{n+p} \perp F_n \mid E_n + G$.

This property is proved by induction. It is true for p = 1. Let us
assume this property is true for general p. (i) implies
$E_{n+p+1} \perp F_{n+p} \mid E_{n+p} + G$ and A.4 implies the result.  □

A.11.  Theorem.

With the same definitions as in the preceding theorem the following
two properties are equivalent:

(i)   $\forall n \ge 0$:  $E_\infty \perp F_n \mid E_n + F_{n-1} + G$

(ii)  $\forall n \ge 0$:  $E_\infty \perp F_n \mid E_n + F_0 + G$.

The proof is essentially the same as the proof of theorem A.10.